Collocation Extraction Using Web Statistics
Abstract
This paper mines collocations from two web usage corpora: the NTU proxy log and the TTS search log. Human judgment on a 2% sample of the extracted collocations gives precisions of 61.76% for the NTU test data and 57.50% for the TTS test data. For automatic evaluation, we submit each extracted collocation to the Google search engine and use the resulting page counts to compute the mutual information (MI) of the collocation. Experimental results show that 43.27% of the collocations mined from the NTU corpus and 42.65% of those mined from the TTS corpus pass the MI check.
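The automatic evaluation described above can be illustrated with a minimal sketch. The abstract does not give the exact MI formula, the index size, or the pass threshold, so the code below assumes the standard pointwise mutual information, a caller-supplied estimate of the total number of indexed pages (`total_pages`), and a hypothetical threshold of 0. The function names are illustrative and do not come from the paper, and no search engine is actually queried: page counts are passed in as plain integers.

```python
import math


def pmi_from_page_counts(count_xy, count_x, count_y, total_pages):
    """Estimate pointwise mutual information (base 2) from page counts.

    count_xy    -- pages returned for the whole collocation
    count_x     -- pages returned for the first word alone
    count_y     -- pages returned for the second word alone
    total_pages -- assumed size of the search index (an estimate)
    """
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    # A zero count makes the logarithm undefined; treat it as no association.
    if min(count_xy, count_x, count_y) <= 0:
        return float("-inf")
    p_xy = count_xy / total_pages
    p_x = count_x / total_pages
    p_y = count_y / total_pages
    return math.log2(p_xy / (p_x * p_y))


def passes_mi_check(count_xy, count_x, count_y, total_pages, threshold=0.0):
    """Return True if the estimated PMI exceeds the (assumed) threshold."""
    return pmi_from_page_counts(count_xy, count_x, count_y, total_pages) > threshold
```

Under these assumptions, a word pair that co-occurs more often than independence would predict gets a positive score and passes, while a pair whose joint count matches the product of its marginal probabilities scores about 0 and fails.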
Similar resources
Using Collocation Statistics in Information Extraction
Our main objective in participating in MUC-7 is to investigate and experiment with the use of collocation statistics in information extraction. A collocation is a habitual word combination, such as "weather a storm", "file a lawsuit", and "the falling yen". Collocation statistics refers to the frequency counts of the collocational relations extracted from a parsed corpus. For example, out of 6577 i...
Web corpora for bilingual lexicography: A pilot study of English/French collocation extraction and translation
A Mobile Touchable Application for Online Topic Graph Extraction and Exploration of Web Content
We present a mobile touchable application for online topic graph extraction and exploration of web content. The system has been implemented for operation on an iPad. The topic graph is constructed from N web snippets which are determined by a standard search engine. We consider the extraction of a topic graph as a specific empirical collocation extraction task where collocations are extracted b...
Extracting Academic Subjects Semantic Relations Using Collocations
The paper presents an approach to analyzing the semantic content of academic subjects and their internal relations using statistical techniques for collocation extraction from a large electronic educational text corpus. It offers a survey and analysis of some related corpus-based approaches to extracting conceptual relations for educational purposes and presents a technique for semantic search of c...
Exploratory Search on the Mobile Web
We present a mobile touchable application for online topic graph extraction and exploration of web content. The system has been implemented for operation on a tablet computer, i.e. an Apple iPad, and on a mobile device, i.e. Apple iPhone or iPod touch. The topics are extracted from web snippets which are determined by a standard search engine. We consider the extraction of topics as a specific ...